# Wikipedia Pretraining

## NusaBERT Base

NusaBERT Base is a multilingual encoder language model based on the BERT architecture, supporting 13 Indonesian regional languages and pretrained on multiple open-source corpora.

- License: Apache-2.0
- Tags: Large Language Model, Transformers, Other
- Author: LazarusNLP
- Downloads: 68
- Likes: 3

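As a BERT-style encoder, it can be queried directly through the Hugging Face fill-mask pipeline. A minimal sketch, assuming the hub ID is `LazarusNLP/NusaBERT-base` (inferred from the card's author and name):

```python
# Minimal fill-mask sketch; the model ID below is inferred from the card's
# author/name and should be checked against the hub before use.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="LazarusNLP/NusaBERT-base")

# BERT-style encoders predict the [MASK] token; the pipeline returns the
# top-scoring candidates with their probabilities.
for pred in fill_mask("Ibu kota Indonesia adalah [MASK]."):
    print(f"{pred['token_str']!r}  score={pred['score']:.3f}")
```
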
## Multilingual ALBERT Base Cased 64k

A multilingual ALBERT model pretrained with the masked language modeling (MLM) objective, using a 64k-token case-sensitive vocabulary.

- License: Apache-2.0
- Tags: Large Language Model, Transformers, Supports Multiple Languages
- Author: cservan
- Downloads: 52
- Likes: 1

## LUKE Japanese Wordpiece Base

A Japanese LUKE model built on top of Japanese BERT, optimized for Japanese named entity recognition tasks.

- License: Apache-2.0
- Tags: Sequence Labeling, Transformers, Japanese
- Author: uzabase
- Downloads: 16
- Likes: 4

## DeBERTa V2 Base Japanese

A Japanese DeBERTa V2 base model pretrained on Japanese Wikipedia, CC-100, and OSCAR corpora, suitable for masked language modeling and downstream task fine-tuning.

- Tags: Large Language Model, Transformers, Japanese
- Author: ku-nlp
- Downloads: 38.93k
- Likes: 29

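For downstream fine-tuning, a task head can be attached on top of the pretrained encoder. A minimal sketch, assuming the hub ID `ku-nlp/deberta-v2-base-japanese` (inferred from the card) and noting that the upstream model card expects input pre-segmented with Juman++:

```python
# Sketch: attach a freshly initialized classification head to the encoder
# for fine-tuning. The hub ID is inferred from the card; the upstream card
# notes that input text should be pre-segmented with Juman++.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "ku-nlp/deberta-v2-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, num_labels=2  # head weights are randomly initialized
)

inputs = tokenizer("京都 大学 で 自然 言語 処理 を 学ぶ", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)
```
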
## RoBERTa Base Japanese with Auto Jumanpp

A Japanese pretrained model based on the RoBERTa architecture that runs Juman++ tokenization automatically, suitable for Japanese natural language processing tasks.

- Tags: Large Language Model, Transformers, Japanese
- Author: nlp-waseda
- Downloads: 536
- Likes: 8

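The "auto Juman++" variant performs morphological segmentation inside the tokenizer, so raw, unsegmented text can be passed in directly. A sketch, assuming Juman++ and its Python binding are installed locally and that the hub ID follows the card name:

```python
# Sketch: the tokenizer runs Juman++ internally, so raw Japanese text can
# be passed without pre-segmentation. Requires a local Juman++ install and
# its Python binding; the hub ID is inferred from the card.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "nlp-waseda/roberta-base-japanese-with-auto-jumanpp"
)
print(tokenizer.tokenize("早稲田大学で自然言語処理を学ぶ"))
```
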
## DeBERTa Base Japanese Wikipedia

A DeBERTa (V2) model pretrained on Japanese Wikipedia and Aozora Bunko texts, suitable for Japanese text processing tasks.

- Tags: Large Language Model, Transformers, Japanese
- Author: KoichiYasuoka
- Downloads: 32
- Likes: 2

## ALBERT Base Japanese V1 with Japanese Tokenizer

A Japanese-pretrained ALBERT model that uses BertJapaneseTokenizer as its tokenizer, which makes Japanese text processing more convenient.

- License: MIT
- Tags: Large Language Model, Transformers, Japanese
- Author: ken11
- Downloads: 44
- Likes: 3

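Because the checkpoint ships with BertJapaneseTokenizer, it loads like other Japanese BERT-family models. A sketch, assuming the hub ID `ken11/albert-base-japanese-v1-with-japanese-tokenizer` and the MeCab bindings (e.g. fugashi) installed:

```python
# Sketch: AutoTokenizer resolves to BertJapaneseTokenizer here, so loading
# mirrors other Japanese BERT-family models. Hub ID inferred from the card;
# MeCab bindings (e.g. fugashi) must be installed for word segmentation.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "ken11/albert-base-japanese-v1-with-japanese-tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
print(tokenizer.tokenize("日本語のテキストを処理する"))
```
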
## mLUKE Base Lite

mLUKE is a multilingual extension of LUKE, supporting text processing tasks in 24 languages.

- License: Apache-2.0
- Tags: Large Language Model, Transformers, Supports Multiple Languages
- Author: studio-ousia
- Downloads: 153
- Likes: 2

## BERT Base Japanese Char

A BERT model pretrained on Japanese text using character-level tokenization, suitable for Japanese natural language processing tasks.

- Tags: Large Language Model, Japanese
- Author: tohoku-nlp
- Downloads: 116.10k
- Likes: 8

## Tiny RoBERTa Indonesia

A small Indonesian RoBERTa model, optimized for Indonesian text processing tasks.

- License: MIT
- Tags: Large Language Model, Transformers, Other
- Author: akahana
- Downloads: 17
- Likes: 1

## mLUKE Base

mLUKE is a multilingual extension of LUKE, supporting named entity recognition, relation classification, and question answering tasks in 24 languages.

- License: Apache-2.0
- Tags: Large Language Model, Transformers, Supports Multiple Languages
- Author: studio-ousia
- Downloads: 64
- Likes: 6

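LUKE-family models accept entity spans alongside the text and produce dedicated entity representations, which is what makes them suited to entity-centric tasks like NER and relation classification. A sketch of entity-aware encoding with the transformers LUKE classes, assuming the hub ID `studio-ousia/mluke-base`:

```python
# Sketch of LUKE-style entity-aware encoding: the tokenizer takes
# character-level entity spans and the model returns per-entity hidden
# states in addition to token states. Hub ID inferred from the card.
from transformers import MLukeTokenizer, LukeModel

model_id = "studio-ousia/mluke-base"
tokenizer = MLukeTokenizer.from_pretrained(model_id)
model = LukeModel.from_pretrained(model_id)

text = "Jakarta is the capital of Indonesia."
entity_spans = [(0, 7)]  # character span covering "Jakarta"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
print(outputs.entity_last_hidden_state.shape)  # (1, 1, hidden_size)
```
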
## BERT Base Multilingual Cased Fine-tuned Polish SQuAD1

A Polish question answering model fine-tuned from multilingual BERT, performing well on the Polish SQuAD1.1 dataset.

- Tags: Question Answering System, Other
- Author: henryk
- Downloads: 86
- Likes: 4

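Extractive QA checkpoints like this one plug straight into the question-answering pipeline. A sketch, assuming the hub ID `henryk/bert-base-multilingual-cased-finetuned-polish-squad1` (inferred from the card):

```python
# Sketch: extractive question answering over a Polish context; the hub ID
# is inferred from the card's author/name.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="henryk/bert-base-multilingual-cased-finetuned-polish-squad1",
)
result = qa(
    question="Jak nazywa się stolica Polski?",
    context="Warszawa jest stolicą i największym miastem Polski.",
)
print(result["answer"], result["score"])
```
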
## BERT Base Japanese

A BERT model pretrained on Japanese Wikipedia text, using the IPA dictionary for word-level tokenization, suitable for Japanese natural language processing tasks.

- Tags: Large Language Model, Japanese
- Author: tohoku-nlp
- Downloads: 153.44k
- Likes: 38

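The difference between this word-level variant and the character-level variant listed above shows up directly in the tokenizer output. A sketch comparing the two, assuming the tohoku-nlp hub IDs and the MeCab dependencies (fugashi, ipadic) installed:

```python
# Sketch contrasting word-level (IPA dictionary) and character-level
# tokenization in the tohoku-nlp checkpoints. Hub IDs are inferred from
# the cards; the word-level tokenizer needs fugashi and ipadic.
from transformers import AutoTokenizer

word_tok = AutoTokenizer.from_pretrained("tohoku-nlp/bert-base-japanese")
char_tok = AutoTokenizer.from_pretrained("tohoku-nlp/bert-base-japanese-char")

text = "東北大学で学ぶ"
print(word_tok.tokenize(text))  # morpheme-level pieces
print(char_tok.tokenize(text))  # one token per character
```
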
## BERT Base Japanese Whole Word Masking

A BERT model pretrained on Japanese text using IPA dictionary tokenization and whole word masking.

- Tags: Large Language Model, Japanese
- Author: tohoku-nlp
- Downloads: 113.33k
- Likes: 65

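Whole word masking masks all sub-word pieces of a selected word together rather than masking pieces independently. transformers ships a data collator implementing this; a sketch of how such pretraining batches are produced, assuming the tohoku-nlp hub ID and the MeCab dependencies installed:

```python
# Sketch of whole word masking: every sub-word piece of a chosen word is
# masked together, the technique this checkpoint was pretrained with.
# Hub ID inferred from the card; needs fugashi and ipadic installed.
from transformers import AutoTokenizer, DataCollatorForWholeWordMask

tokenizer = AutoTokenizer.from_pretrained(
    "tohoku-nlp/bert-base-japanese-whole-word-masking"
)
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)

batch = collator([tokenizer("日本語のテキストを処理する")])
print(batch["input_ids"])  # some whole words replaced by [MASK]
print(batch["labels"])     # original ids at masked positions, -100 elsewhere
```
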
## BERT Large Japanese Char

A BERT model pretrained on Japanese Wikipedia, using character-level tokenization together with a whole word masking strategy, suitable for Japanese natural language processing tasks.

- Tags: Large Language Model, Japanese
- Author: tohoku-nlp
- Downloads: 24
- Likes: 4

## BERT Base Japanese V2

A BERT model pretrained on Japanese Wikipedia, using the Unidic dictionary for word-level tokenization and whole word masking.

- Tags: Large Language Model, Japanese
- Author: tohoku-nlp
- Downloads: 12.59k
- Likes: 26

## BERT Large Japanese

A BERT large model pretrained on Japanese Wikipedia, using Unidic dictionary tokenization and a whole word masking strategy.

- Tags: Large Language Model, Japanese
- Author: tohoku-nlp
- Downloads: 1,272
- Likes: 9

## mLUKE Large

mLUKE is the multilingual extension of LUKE, supporting named entity recognition, relation classification, and question answering tasks in 24 languages.

- License: Apache-2.0
- Tags: Large Language Model, Transformers, Supports Multiple Languages
- Author: studio-ousia
- Downloads: 70
- Likes: 2

## BERT Base En Hi Cased

A smaller version of bert-base-multilingual-cased covering English and Hindi that preserves the original model's accuracy.

- License: Apache-2.0
- Tags: Large Language Model, Other
- Author: Geotrend
- Downloads: 15
- Likes: 0

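These "slim" checkpoints keep the original weights but shrink the vocabulary, and therefore the embedding matrix, to the target languages. A sketch that makes the reduction visible by comparing vocabulary sizes, assuming the hub ID `Geotrend/bert-base-en-hi-cased`:

```python
# Sketch: compare vocabulary sizes of the full multilingual model and the
# slimmed English+Hindi variant. The Geotrend hub ID is inferred from the
# card; the smaller vocabulary is where most of the size saving comes from.
from transformers import AutoTokenizer

full = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
slim = AutoTokenizer.from_pretrained("Geotrend/bert-base-en-hi-cased")

print(len(full))  # ~119k entries covering 104 languages
print(len(slim))  # far fewer entries, English and Hindi only
```
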
## BERT Base Ja Cased

A smaller Japanese-only version of bert-base-multilingual-cased that preserves the original model's accuracy.

- License: Apache-2.0
- Tags: Large Language Model, Japanese
- Author: Geotrend
- Downloads: 13
- Likes: 0